The impact of isolation kernel on agglomerative hierarchical clustering algorithms

Abstract

Agglomerative hierarchical clustering (AHC) is one of the popular clustering approaches. AHC generates a dendrogram that provides richer information and insights from a dataset than partitioning clustering. However, a major problem with existing distance-based AHC methods is that they fail to effectively identify adjacent clusters with varied densities, regardless of the cluster extraction method applied to the resultant dendrogram. This paper aims to reveal the root cause of this issue and provide a solution by using a data-dependent kernel. We analyse the condition under which existing AHC methods fail to extract clusters effectively, and give the reason why a data-dependent kernel is an effective remedy. This leads to a new approach to kernelise existing hierarchical clustering algorithms, including traditional AHC algorithms, HDBSCAN, GDL, PHA and HC-OT. Our extensive empirical evaluation shows that the recently introduced Isolation Kernel produces a higher-quality or purer dendrogram than distance, Gaussian Kernel and adaptive Gaussian Kernel in all the above-mentioned algorithms.
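To make the idea of kernelising AHC concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Voronoi-style estimate of the Isolation Kernel: in each of `t` rounds, `psi` sample points act as cell centres, and the similarity of two points is the fraction of rounds in which they share the same nearest centre. The function names `isolation_kernel` and `kernel_ahc`, the parameter defaults, and the use of `1 - K` as the dissimilarity passed to SciPy's `linkage` are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import cdist, squareform


def isolation_kernel(X, psi=8, t=200, seed=0):
    """Estimate an Isolation-Kernel-style similarity matrix (illustrative).

    In each of t rounds, psi points are sampled as cell centres; two points
    count as similar in that round if they share the same nearest centre.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    K = np.zeros((n, n))
    for _ in range(t):
        centres = X[rng.choice(n, size=psi, replace=False)]
        cell = cdist(X, centres).argmin(axis=1)  # nearest-centre index per point
        K += cell[:, None] == cell[None, :]      # 1 where two points share a cell
    return K / t


def kernel_ahc(X, n_clusters, method="average", **kernel_args):
    """Run standard AHC on the kernel-derived dissimilarity 1 - K (assumed)."""
    K = isolation_kernel(X, **kernel_args)
    D = 1.0 - K
    np.fill_diagonal(D, 0.0)
    Z = linkage(squareform(D, checks=False), method=method)  # dendrogram
    return fcluster(Z, t=n_clusters, criterion="maxclust")


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    dense = rng.normal(loc=0.0, scale=0.1, size=(100, 2))   # dense cluster
    sparse = rng.normal(loc=1.5, scale=0.6, size=(100, 2))  # adjacent sparse cluster
    labels = kernel_ahc(np.vstack([dense, sparse]), n_clusters=2)
    print(np.bincount(labels)[1:])  # cluster sizes
```

Because the cells adapt to the local sample density, two points in a sparse region tend to be rated more similar than two equally distant points in a dense region, which is the data-dependent property the abstract refers to. The choice of average linkage and of `psi` and `t` here is arbitrary and would need tuning on real data.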

Similar resources

Modern hierarchical, agglomerative clustering algorithms

This paper presents algorithms for hierarchical, agglomerative clustering which perform most efficiently in the general-purpose setup that is given in modern standard software. Requirements are: (1) the input data is given by pairwise dissimilarities between data points, but extensions to vector data are also discussed (2) the output is a “stepwise dendrogram”, a data structure which is shared ...

2 Review of Agglomerative Hierarchical Clustering Algorithms

Hierarchical methods are a well-known clustering technique that can be potentially very useful for various data mining tasks. A hierarchical clustering scheme produces a sequence of clusterings in which each clustering is nested into the next clustering in the sequence. Since hierarchical clustering is a greedy search algorithm based on a local search, the merging decision made early in the agglo...

Fuzzification of Agglomerative Hierarchical Crisp Clustering Algorithms

User generated content from fora, weblogs and other social networks is a very fast growing data source in which different information extraction algorithms can provide a convenient data access. Hierarchical clustering algorithms are used to provide topics covered in this data on different levels of abstraction. During the last years, there has been some research using hierarchical fuzzy algorit...

Agglomerative hierarchical kernel spectral clustering for large scale networks

We propose an agglomerative hierarchical kernel spectral clustering (AH-KSC) model for large scale complex networks. The kernel spectral clustering (KSC) method uses a primal-dual framework to build a model on a subgraph of the network. We exploit the structure of the projections in the eigenspace to automatically identify a set of distance thresholds. These thresholds lead to the different lev...

A Comparison of Agglomerative Hierarchical Algorithms for Modularity Clustering

Modularity is a popular measure for the quality of a cluster partition. Primarily, its popularity originates from its suitability for community identification through maximization. A lot of algorithms to maximize modularity have been proposed in recent years. Especially agglomerative hierarchical algorithms showed to be fast and find clusterings with high modularity. In this paper we present se...

Journal

Journal title: Pattern Recognition

Year: 2023

ISSN: 1873-5142, 0031-3203

DOI: https://doi.org/10.1016/j.patcog.2023.109517